Abstract

Distributed Arithmetic Coding (DAC) is an effective implementation of Slepian-Wolf coding,
especially for short data blocks. To study its properties, the concept of DAC codeword distribution
along proper and wrong decoding paths has been introduced. For the DAC codeword distribution of
equiprobable binary sources along proper decoding paths, the problem was formulated as solving a
system of functional equations. However, so far only one closed form has been obtained, at rate 0.5,
while in general finding the closed form of the DAC codeword distribution remains a very difficult
task. This paper proposes three approximation methods for the DAC codeword distribution of
equiprobable binary sources along proper decoding paths: numeric approximation, polynomial
approximation, and Gaussian approximation. First, as a general approach, a numeric method is
iterated to approximate the DAC codeword distribution. Second, at rates lower than 0.5, the DAC
codeword distribution can be well approximated by a polynomial. Third, at very low rates, a Gaussian
function centered at 0.5 is shown to be a good and simple approximation to the DAC codeword
distribution. A simple way to estimate the variance of the Gaussian function is also proposed.
Simulation results are given to verify the theoretical analyses.
I. Introduction

Consider the problem of Slepian-Wolf Coding (SWC) with decoder Side Information (SI), i.e.
the encoder compresses a discrete source $X$ in the absence of $Y$, the discretely-correlated SI.
The Slepian-Wolf theorem states that lossless compression is achievable at rates $R \ge H(X|Y)$ [1],
where $H(X|Y)$ is the conditional entropy of $X$ given $Y$. Conventionally, channel codes, e.g., turbo
codes [2] or Low-Density Parity-Check (LDPC) codes [3], are used to implement the SWC.

Arithmetic Coding (AC) was proposed long ago as the successor of Huffman coding for source
coding, and it shows near-entropy performance [4], [5], [6]. Along with its high compression
efficiency, however, the AC increases the computational complexity and the noise sensitivity of the
bitstream. To reduce computational complexity, Quasi-Arithmetic Coding (QAC) was introduced in [7].
To combat noise sensitivity, redundancy is usually reinjected into the bitstream by different means.
In [8], redundancy is reinjected into the bitstream in the form of parity-check bits. In [9], markers
are inserted at known positions in the sequence of source symbols. In [10], forbidden intervals are
exploited for error detection. These approaches fall into the so-called Error Detecting AC (EDAC).
The EDAC can be coupled with Automatic Repeat reQuest (ARQ) [11], [12], [13] or channel codes [13]
to support error correction. To realize Error Correcting AC (ECAC), sequential decoding of arithmetic
codes with forbidden intervals is proposed in [14], whose complexity is reduced in [15] by using
Trellis-Coded Modulation (TCM) and the List Viterbi decoding Algorithm (LVA). A soft decoding
procedure is described in [16], whose counterpart for QAC appears in [17]. The Maximum A
Posteriori (MAP) decoding procedure is proposed and applied to image transmission in [18], [19],
[20], [21].

Recently, the AC has also been applied to implement the SWC. One approach is to allow overlapped
intervals, which mirrors the work in [10]. Such examples include Distributed Arithmetic Coding
(DAC) [22], [23] and Overlapped Quasi-Arithmetic Coding (OQAC) [24]. Another approach is
to puncture some bits of the AC bitstream, e.g. Punctured Quasi-Arithmetic Coding (PQAC) [25],
which mirrors the work in [9]. There are also some variants of the DAC. The symmetric SWC is
implemented by the time-shared DAC (TS-DAC) [26]. The rate-compatible DAC is proposed in
[27]. Furthermore, the decoder-driven adaptive DAC [28] is proposed to estimate source probabilities
on the fly.
We note that the ECAC and the DAC in fact generalize the classic AC in opposite directions.
The ECAC encodes source $X$ at rates $R \ge H(X)/C \ge H(X)$ by introducing forbidden intervals,
where $C$ is the channel capacity. The forbidden intervals, corresponding to forbidden symbols, lead
to a longer codeword because the final interval is narrowed, and also inject redundancy into the
resulting bitstream. The decoder jointly exploits both the received bitstream and the known channel
parameters to reconstruct source $X$. The DAC encodes source $X$ at rates $H(X|Y) \le R \le H(X)$ by
allowing overlapped intervals. The overlapped intervals, corresponding to ambiguous symbols, lead to
a shorter codeword because the final interval is enlarged, and also induce ambiguity in the resulting
bitstream. A soft joint decoder exploits both the received bitstream and $Y$ to reconstruct $X$.

Though it is well known that the classic AC can theoretically achieve the source entropy $H(X)$,
it is not clear whether the DAC can achieve the conditional entropy $H(X|Y)$. If not, what is the
performance limit of the DAC, and can its performance be improved? If so, how can this be realized?
Intuitively, before answering these questions, one needs to know how many branches are generated
during DAC decoding. In addition, it may also be helpful to know the distribution of Hamming
distances between decoding branches and source $X$.

As a first step toward analyzing the properties of the DAC, [29] introduces the concept of
codeword distribution, which seems promising for answering these questions. The DAC codeword
distribution is a function defined over interval $[0, 1)$. For equiprobable binary sources, both the
codeword distribution along proper decoding paths and the codeword distribution along wrong
decoding paths are studied. For the codeword distribution along proper decoding paths, the
problem is formulated as solving a system of functional equations including four constraints
[29]. It is affirmed that rate $R = 0.5$ is a watershed: when $R > 0.5$, the DAC codeword distribution
is an unsmooth function, while when $R < 0.5$, it is a smooth function. In particular, a closed form
is obtained at $R = 0.5$. In spite of these achievements, it remains a very difficult task to find the
closed form of the codeword distribution along proper decoding paths in general. As for the codeword
distribution along wrong decoding paths, only some simulation results are reported in [29], while the
problem formulation remains an open issue. It is worth pointing out that the concept of codeword
distribution can be easily extended to the ECAC.

This paper makes some advances on the work in [29]. Three approximation methods are
proposed for the codeword distribution of equiprobable binary sources along proper decoding paths:
numeric approximation, polynomial approximation, and Gaussian approximation. Among them,
numeric approximation is a general approach. At low rates ($R < 0.5$), polynomial approximation
works well. To reduce computational complexity at very low rates, Gaussian approximation is
used as an alternative to polynomial approximation.

This paper is arranged as follows. In Section II, after a brief introduction to the binary DAC
codec, the DAC decoding process is analyzed in detail to show the significance of the DAC codeword
distribution. Then the investigated problem is formulated in Section III. Sections IV, V, and VI
describe in detail numeric approximation, polynomial approximation, and Gaussian approximation,
respectively, where simulation results are also reported. Finally, Section VII concludes this paper.

II. Binary Distributed Arithmetic Coding 

A. Encoding 

Consider a binary source $X = \{x_i\}_{i=1}^{I}$ with bias probability $p = \Pr(x_i = 1)$. In the classic AC,
source symbol $x_i$ is iteratively mapped onto sub-intervals of $[0, 1)$ whose lengths are proportional
to $(1-p)$ and $p$, giving rate $R = H(X)$. Instead, in the DAC [22], [23], sub-interval lengths are
proportional to the enlarged probabilities $(1-p)^\gamma$ and $p^\gamma$, where $H(X|Y)/H(X) \le \gamma \le 1$, giving
rate $R = \gamma H(X) \ge H(X|Y)$. For conciseness, we refer to $\gamma$ as the overlap coefficient hereinafter.
More specifically, symbols $x_i = 0$ and $x_i = 1$ correspond to sub-intervals $[0, (1-p)^\gamma)$ and
$[1 - p^\gamma, 1)$, respectively. This means that, to fit within the $[0, 1)$ interval, the sub-intervals
have to partially overlap. This overlapping leads to a larger final interval, and hence a shorter
codeword. However, as a cost, the decoder cannot decode $X$ unambiguously without $Y$.
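The interval mapping just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the codec of [22], [23]: it works in floating point rather than with finite-precision bit output, the function names `dac_encode_interval` and `codeword_bits` are hypothetical, and the codeword length is approximated by the negative base-2 logarithm of the final interval width.

```python
import math

def dac_encode_interval(bits, p, gamma):
    """Return (low, width) of the final DAC interval for a 0/1 sequence.

    Symbol 0 maps to [0, (1-p)**gamma) and symbol 1 to [1 - p**gamma, 1),
    both taken relative to the current interval.
    """
    q0 = (1.0 - p) ** gamma   # enlarged length for symbol 0
    q1 = p ** gamma           # enlarged length for symbol 1
    low, width = 0.0, 1.0
    for b in bits:
        if b == 0:
            width *= q0                   # keep the left sub-interval
        else:
            low += width * (1.0 - q1)     # jump to the right sub-interval
            width *= q1
    return low, width

def codeword_bits(width):
    """Approximate codeword length: -log2 of the final interval width."""
    return -math.log2(width)

bits = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
_, w_classic = dac_encode_interval(bits, p=0.5, gamma=1.0)  # classic AC
_, w_dac = dac_encode_interval(bits, p=0.5, gamma=0.5)      # DAC at rate 0.5
print(codeword_bits(w_classic), codeword_bits(w_dac))
```

With `gamma=1.0` the sub-intervals do not overlap and the sketch reduces to the classic AC; lowering `gamma` widens the final interval, which is where the shorter codeword comes from.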

Note that when $\gamma \ge 1/C \ge 1$, where $C$ is the channel capacity, it becomes the ECAC.

B. Decoding 

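To describe the decoding process, a ternary symbol set $\{0, A, 1\}$ is defined, where $A$ represents
the ambiguous symbol. Let $C_X$ be the DAC codeword and $\tilde{x}_i$ be the $i$-th decoded symbol, then

$$\tilde{x}_i = \begin{cases} 0, & 0 \le C_X < 1 - p^\gamma \\ A, & 1 - p^\gamma \le C_X < (1-p)^\gamma \\ 1, & (1-p)^\gamma \le C_X < 1 \end{cases} \tag{1}$$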

After $\tilde{x}_i$ is decoded, if $\tilde{x}_i = A$, the decoder performs a branching: two candidate branches
are generated, corresponding to the two alternative symbols $\tilde{x}_i = 0$ and $\tilde{x}_i = 1$. For each new
branch, its metric is updated and the corresponding interval is selected for the next iteration. To
reduce complexity, every time a symbol is decoded, the decoder uses the M-algorithm to keep
at most $M$ paths with the best partial metric and prunes the others [22], [23]. Finally, after all
source symbols are decoded, the path with the best metric is output as the estimate of $X$. For
detailed performance comparisons between DAC and LDPC-based SWC, please refer to [23].
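The branching and pruning steps can be illustrated with a toy decoder. The sketch below makes several assumptions that are not taken from [22], [23]: the codeword is a floating-point number (the midpoint of the final interval), the path metric is the Hamming distance to the side information, ties are broken by sort order, and the names `encode`, `classify`, and `decode` are hypothetical.

```python
def encode(bits, p, gamma):
    """Toy DAC encoder: return the midpoint of the final interval."""
    q0, q1 = (1.0 - p) ** gamma, p ** gamma
    low, width = 0.0, 1.0
    for b in bits:
        if b == 0:
            width *= q0
        else:
            low += width * (1.0 - q1)
            width *= q1
    return low + width / 2.0

def classify(z, p, gamma):
    """Ternary decision of (1) for a relative codeword position z in [0, 1)."""
    if z < 1.0 - p ** gamma:
        return 0
    if z < (1.0 - p) ** gamma:
        return "A"
    return 1

def decode(code, side_info, p, gamma, max_paths=64):
    """M-algorithm decoder; the path metric is the Hamming distance to Y."""
    q0, q1 = (1.0 - p) ** gamma, p ** gamma
    paths = [(0, [], 0.0, 1.0)]  # (metric, symbols, low, width)
    for y in side_info:
        grown = []
        for metric, syms, low, width in paths:
            s = classify((code - low) / width, p, gamma)
            for b in ((0, 1) if s == "A" else (s,)):   # branch on A
                if b == 0:
                    nxt = (low, width * q0)
                else:
                    nxt = (low + width * (1.0 - q1), width * q1)
                grown.append((metric + (b != y), syms + [b]) + nxt)
        grown.sort(key=lambda path: path[0])  # smallest distance first
        paths = grown[:max_paths]             # prune to at most M paths
    return paths[0][1]

x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
y = list(x)
y[3] ^= 1                                     # side information, one bit flipped
code = encode(x, p=0.5, gamma=0.8)
print(decode(code, y, p=0.5, gamma=0.8))
```

The decoder returns the surviving path closest to the side information. That path equals the source only if no wrong path is at least as close to `y`, which is exactly the failure event discussed in the next subsection.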

C. Discussion 

It is worth pointing out that during DAC decoding, the metric of each path is in fact the
Hamming distance between this path and the SI $Y$. As we know, each DAC codeword defines a
set of possible decoding paths, and each possible decoding path corresponds to a sequence of
decoded symbols. However, among all possible decoding paths, there is one and only one proper
path, which corresponds to source $X$. Let $\tilde{X} = \{\tilde{x}_i\}_{i=1}^{I}$ be a sequence of decoded symbols. Let
$D(Y, \tilde{X})$ be the Hamming distance between $Y$ and $\tilde{X}$. Similarly, $D(X, \tilde{X})$ and $D(X, Y)$ are
also defined. Obviously,

$$D(Y, \tilde{X}) \le D(X, Y) + D(X, \tilde{X}). \tag{2}$$

The task of a DAC decoder is in fact to find a path $X'$ that minimizes $D(Y, \tilde{X})$, i.e.

$$X' = \arg\min_{\tilde{X}} D(Y, \tilde{X}). \tag{3}$$

However, this does not always imply $D(X, X') = 0$. If $D(X, X') \ne 0$, then a decoding
failure occurs. To find the probability of decoding failure, we need to know the distributions of
$D(Y, \tilde{X})$ and $D(X, \tilde{X})$.
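Inequality (2) is the triangle inequality of the Hamming metric. As a small sanity check, the sketch below verifies it exhaustively for short binary sequences; the helper names `hamming` and `triangle_holds` are hypothetical and the sequence length is kept small so that the enumeration stays cheap.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(s != t for s, t in zip(a, b))

def triangle_holds(n):
    """Exhaustively check D(Y,Xt) <= D(X,Y) + D(X,Xt) for length-n sequences."""
    seqs = list(product((0, 1), repeat=n))
    return all(hamming(y, xt) <= hamming(x, y) + hamming(x, xt)
               for x in seqs for y in seqs for xt in seqs)

print(triangle_holds(3))
```

Since the proper path has distance $D(X, Y)$ to the side information, a decoding failure requires a wrong path whose distance to $Y$ does not exceed $D(X, Y)$.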

Though it is very difficult to find the distributions of $D(Y, \tilde{X})$ and $D(X, \tilde{X})$, this problem can
be tackled by means of the DAC codeword distribution. As shown in [29], if we know the codeword
distributions along proper and wrong decoding paths, it seems promising to find the number of
possible decoding paths and the distributions of $D(Y, \tilde{X})$ and $D(X, \tilde{X})$.

The rest of this paper makes some advances on the DAC codeword distribution along proper
decoding paths.

III. Problem Formulation 

To simplify the analysis, we consider an infinite-length, stationary, and equiprobable binary
source $X = \{x_i\}_{i=1}^{\infty}$. As $p = 0.5$, symbols $x_i = 0$ and $x_i = 1$ correspond to sub-intervals $[0, q)$
and $[1-q, 1)$, respectively, where $q = 0.5^\gamma$. The resulting rate is $R = \gamma H(X) = \gamma$.
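Because the rest of the paper is parameterized by $q$ while rates are quoted in terms of $\gamma$, a two-line conversion is handy. The helper names below are hypothetical; the relation itself is just $q = 0.5^\gamma$ for an equiprobable binary source.

```python
import math

def q_from_gamma(gamma):
    """Sub-interval length q = 0.5**gamma for an equiprobable binary source."""
    return 0.5 ** gamma

def gamma_from_q(q):
    """Overlap coefficient (equal to the rate R when p = 0.5)."""
    return -math.log2(q)

for q in (0.5, (math.sqrt(5) - 1) / 2, 1 / math.sqrt(2), 0.8, 0.99):
    print(f"q = {q:.4f}  ->  R = {gamma_from_q(q):.4f}")
```

Larger $q$ means more overlap and hence a lower rate: $q = 0.5$ is the classic AC at rate 1, and $q = 1/\sqrt{2}$ is the rate-0.5 watershed.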
[Figure 1: plot of $f(u)$, $f_0(u)$, and $f_1(u)$ versus $u$ for $q = 1/\sqrt{2}$]

Fig. 1. Illustrations of $f(u)$, $f_0(u)$, and $f_1(u)$ for $q = 1/\sqrt{2}$. $f(u)$ is symmetric around $u = 0.5$, i.e.
$f(u) = f(1-u)$. $f(u)$, $f_0(u)$, and $f_1(u)$ have the same shape. $f(u) = (f_0(u) + f_1(u))/2$. $f_0(u)$ can be
obtained by first squeezing $f(u)$ by $q$ times along the $u$-axis and then stretching it by $1/q$ times along
the $y$-axis, i.e. $f_0(u) = f(u/q)/q$. $f_1(u)$ can be obtained by shifting $f_0(u)$ right by $(1-q)$, i.e.
$f_1(u) = f_0(u-(1-q))$. Due to the symmetry, $f_1(u) = f_0(1-u)$. $f(u)$, $f_0(u)$, and $f_1(u)$ intersect
at $u = 0.5$, i.e. $f(0.5) = f_0(0.5) = f_1(0.5)$. Hence $q f(0.5) = f(0.5/q)$.



Let $C_X$ be the DAC codeword of $X$ and $f(u)$ ($0 \le u < 1$) be the distribution of $C_X$, then

$$\int_0^1 f(u)\,du = 1. \tag{4}$$

Due to the symmetry, we have

$$f(u) = f(1-u), \quad 0 < u < 1. \tag{5}$$

Symbols $x_1 = 0$ and $x_1 = 1$ correspond to intervals $[0, q)$ and $[1-q, 1)$, respectively. If $x_1 = 0$,
the remaining sequence $X_2 = \{x_i\}_{i=2}^{\infty}$ will be iteratively mapped onto the sub-intervals of $[0, q)$;
otherwise, $X_2$ will be iteratively mapped onto the sub-intervals of $[1-q, 1)$. Let $C^0_{X_2}$ be the
DAC codeword of $X_2$ given $x_1 = 0$ and $f_0(u)$ be the distribution of $C^0_{X_2}$, then

$$\int_0^q f_0(u)\,du = 1. \tag{6}$$

Since $X$ is infinite-length and stationary, $f_0(u)$ must have the same shape as $f(u)$, i.e.,

$$f_0(u) = f(u/q)/q, \quad 0 \le u < q. \tag{7}$$

Similarly, let $C^1_{X_2}$ be the DAC codeword of $X_2$ given $x_1 = 1$ and $f_1(u)$ be the distribution of
$C^1_{X_2}$, then

$$f_1(u) = f_0(u-(1-q)) = f\left(\frac{u-(1-q)}{q}\right)/q, \quad (1-q) \le u < 1. \tag{8}$$



September 28, 2010 



DRAFT 



Due to the symmetry,

$$f_1(u) = f_0(1-u). \tag{9}$$

The relations between $f(u)$, $f_0(u)$, and $f_1(u)$ are illustrated by Fig. 1. Obviously,

$$f(u) = \Pr(x_1 = 0) f_0(u) + \Pr(x_1 = 1) f_1(u) = (f_0(u) + f_1(u))/2. \tag{10}$$

Hence, $f(u)$, $f_0(u)$, and $f_1(u)$ intersect at $u = 0.5$, i.e. $f(0.5) = f_0(0.5) = f_1(0.5)$. Thus

$$q f(0.5) = f(0.5/q). \tag{11}$$
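Before solving the functional equations, it can help to look at $f(u)$ empirically. The sketch below draws codewords by nesting random sub-intervals and builds a coarse histogram. It rests on assumptions beyond the text: the codeword is identified with the lower end of the nested interval after a finite number of symbols (`depth=60`), the bin count and sample size are arbitrary, and the function names are hypothetical.

```python
import random

def sample_codeword(q, rng, depth=60):
    """Draw one codeword C_X by nesting `depth` random sub-intervals.

    Symbol 0 keeps [0, q) of the current interval and symbol 1 keeps
    [1 - q, 1); truncating at `depth` symbols is an approximation.
    """
    low, width = 0.0, 1.0
    for _ in range(depth):
        if rng.random() < 0.5:        # x_i = 1 with probability 0.5
            low += width * (1.0 - q)
        width *= q
    return low

def histogram(q, samples=20000, bins=20, seed=1):
    """Estimate f(u) on `bins` equal cells from Monte Carlo samples."""
    rng = random.Random(seed)
    counts = [0] * bins
    for _ in range(samples):
        c = sample_codeword(q, rng, depth=60)
        counts[min(int(c * bins), bins - 1)] += 1
    return [k * bins / samples for k in counts]   # density per cell

density = histogram(q=2 ** -0.5)
print([round(d, 2) for d in density])
```

Such a histogram is noisy and coarse, which is one motivation for the deterministic numeric method of Section IV, but it gives an independent check on the shape implied by (4), (5), and (10).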

A. Classic AC 

When $q = 0.5$, it is just the classic AC. Then

$$f(u) = \begin{cases} f(2u), & 0 \le u < 0.5 \\ f(2u-1), & 0.5 \le u < 1 \end{cases} \tag{12}$$

It is easy to prove that $f(u) = 1$ ($0 \le u < 1$). This is a uniform distribution, so the classic AC can
achieve the source entropy theoretically.

B. Distributed AC 

When $0.5 < q < 1$, sub-intervals $[0, q)$ and $[1-q, 1)$ are partially overlapped, so $f(u)$ is a
piecewise-defined function.

1) $0 \le u < (1-q)$: In this interval, $f_1(u) = 0$, so

$$f(u) = f_0(u)/2 = f(u/q)/(2q). \tag{13}$$

Since $f(0) = f(0/q)/(2q)$, we have $f(0) = 0$.

2) $q \le u < 1$: In this interval, $f_0(u) = 0$, so

$$f(u) = f_1(u)/2 = f\left(\frac{u-(1-q)}{q}\right)/(2q). \tag{14}$$

3) $1-q \le u < q$: In this interval, we have

$$f(u) = \frac{f(u/q) + f\left(\frac{u-(1-q)}{q}\right)}{2q}. \tag{15}$$
C. A Closed Form of $f(u)$ at $q = 1/\sqrt{2}$

Generally, it is very difficult to obtain the closed form of $f(u)$. In [29], only one closed form
is obtained, at $q = 1/\sqrt{2}$ (i.e. $\gamma = 0.5$):

$$f(u) = \begin{cases} \dfrac{u}{3\sqrt{2}-4}, & 0 \le u < \sqrt{2}-1 \\ \dfrac{1}{2-\sqrt{2}}, & \sqrt{2}-1 \le u < 2-\sqrt{2} \\ \dfrac{1-u}{3\sqrt{2}-4}, & 2-\sqrt{2} \le u < 1 \end{cases} \tag{16}$$
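The closed form at $q = 1/\sqrt{2}$ can be checked against the functional equations numerically. The sketch below assumes the trapezoidal shape with slopes $1/(3\sqrt{2}-4)$ and plateau $1/(2-\sqrt{2})$, evaluates the right-hand side of (10) built from (7) and (8) on a grid, and also integrates the density; the function names and the grid size are illustrative choices.

```python
import math

SQRT2 = math.sqrt(2.0)
Q = 1.0 / SQRT2

def f_closed(u):
    """Closed form of f(u) at q = 1/sqrt(2); zero outside [0, 1)."""
    if u < 0.0 or u >= 1.0:
        return 0.0
    if u < SQRT2 - 1.0:
        return u / (3.0 * SQRT2 - 4.0)
    if u < 2.0 - SQRT2:
        return 1.0 / (2.0 - SQRT2)
    return (1.0 - u) / (3.0 * SQRT2 - 4.0)

def rhs(u, q=Q):
    """Right-hand side of (10): (f0(u) + f1(u)) / 2 built from (7) and (8)."""
    f0 = f_closed(u / q) / q if 0.0 <= u < q else 0.0
    f1 = f_closed((u - (1.0 - q)) / q) / q if 1.0 - q <= u < 1.0 else 0.0
    return (f0 + f1) / 2.0

grid = [k / 1000.0 for k in range(1000)]
worst = max(abs(f_closed(u) - rhs(u)) for u in grid)      # residual of (10)
area = sum(f_closed(u + 0.0005) for u in grid) / 1000.0   # midpoint rule for (4)
print(worst, area)
```

A residual at the level of floating-point rounding together with an area close to 1 indicates that the piecewise-linear form satisfies constraints (4) and (10) simultaneously.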



D. Zeros of f(u) at High Rates 



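It is proved in [29] that when $0.5 < q < (\sqrt{5}-1)/2$ (corresponding to $0.6942 < \gamma < 1$), $f(u)$
vanishes at a sequence of points indexed by $n \in \mathbb{N}$, together with their mirror images $1 - u$;
the exact locations are given in [29].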

IV. Numeric Approximation 

Though a special closed form of $f(u)$ is found for $p = \gamma = 0.5$ in [29], the procedure is
very complex. In general, the closed form of $f(u)$ does not exist. As a universal approach, we
propose a numeric method for finding $f(u)$. This method is described in detail below.

A. Discretization 

We divide the interval $[0, 1]$ into $N$ uniform cells. Let $\Delta = 1/N$. Then $f(u)$ can be approximated
by $f(n\Delta)$, where $n \in \mathcal{X}_N = \{0, 1, \ldots, N\}$, given a large $N$.

B. Initialization 

Let $f^{(t)}(n\Delta)$ be the estimate of $f(n\Delta)$ after $t$ iterations. Before iteration, $f^{(0)}(n\Delta)$ needs to
be initialized. Though arbitrary initialization is allowed, we recommend uniform initialization,
i.e. $f^{(0)}(n\Delta) = 1$, where $n \in \mathcal{X}_N$.

C. Iteration 

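Let $L = \lfloor N(1-q) \rfloor = N - \lceil Nq \rceil$ and $H = \lceil Nq \rceil$. Then the iteration is run as follows.

1) $0 \le n < L$: This corresponds to interval $0 \le u < (1-q)$, hence

$$f^{(t)}(n\Delta) = \frac{f^{(t-1)}\big(h_{0,N}(n/q)\,\Delta\big)}{2q}, \tag{17}$$

where

$$h_{a,b}(x) = \begin{cases} a, & \mathrm{round}(x) < a \le b \\ \mathrm{round}(x), & a \le \mathrm{round}(x) \le b \\ b, & a \le b < \mathrm{round}(x) \end{cases} \tag{18}$$

2) $H < n \le N$: This corresponds to interval $q < u \le 1$. Because $L + H = N$,

$$f^{(t)}(n\Delta) = f^{(t)}((N-n)\Delta). \tag{19}$$

3) $L \le n \le H$: This corresponds to interval $(1-q) \le u \le q$, hence

$$f^{(t)}(n\Delta) = \frac{f^{(t-1)}\big(h_{0,N}(n/q)\,\Delta\big) + f^{(t-1)}\big(h_{0,N}\big(\tfrac{n-N(1-q)}{q}\big)\,\Delta\big)}{2q}. \tag{20}$$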

D. Normalization 

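Recalling the constraint $\int_0^1 f(u)\,du = 1$, we have

$$\sum_{n=0}^{N} f^{(t)}(n\Delta)\,\Delta = 1, \tag{21}$$

i.e.

$$\sum_{n=0}^{N} f^{(t)}(n\Delta) = 1/\Delta = N. \tag{22}$$

Let $\sum_{n=0}^{N} f^{(t)}(n\Delta) = \Omega$, then $f^{(t)}(n\Delta)$ should be normalized as below:

$$f^{(t)}(n\Delta) \leftarrow \frac{N}{\Omega}\, f^{(t)}(n\Delta). \tag{23}$$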



E. Termination



We use the Mean Squared Error (MSE) between two successive iterations as a measurement
to terminate the iteration. Let $\delta$ be a small quantity. When

$$\mathrm{MSE}^{(t)} = \frac{1}{N+1} \sum_{n=0}^{N} \left(f^{(t)}(n\Delta) - f^{(t-1)}(n\Delta)\right)^2 < \delta, \tag{24}$$

the iteration is terminated.
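The discretization, iteration, normalization, and termination steps fit into one short routine. The following is a sketch under stated assumptions rather than the authors' released software: the factor $1/(2q)$ in the update follows the functional equations (13) to (15), Python's built-in `round` stands in for the rounding function, the default grid (`n_cells=2000`) is far smaller than the $N = 10^5$ used in the simulations so that pure Python stays fast, and `max_iter` is a safety cap that the text does not mention.

```python
import math

def numeric_approximation(q, n_cells=2000, delta=1e-10, max_iter=500):
    """Iterate the numeric method on the grid u = n / n_cells, n = 0..n_cells."""
    N = n_cells
    L = math.floor(N * (1.0 - q))
    H = math.ceil(N * q)

    def h(x):
        # round to the nearest grid index and clamp it to [0, N]
        return min(max(int(round(x)), 0), N)

    f = [1.0] * (N + 1)                      # uniform initialization
    for t in range(1, max_iter + 1):
        g = [0.0] * (N + 1)
        for n in range(0, H + 1):
            g[n] = f[h(n / q)]               # term coming from f0
            if n >= L:                       # overlap region: add the f1 term
                g[n] += f[h((n - N * (1.0 - q)) / q)]
            g[n] /= 2.0 * q
        for n in range(H + 1, N + 1):        # mirror the left part (symmetry)
            g[n] = g[N - n]
        total = sum(g)
        g = [v * N / total for v in g]       # normalization so the sum equals N
        mse = sum((a - b) ** 2 for a, b in zip(g, f)) / (N + 1)
        f = g
        if mse < delta:                      # successive-MSE stopping rule
            break
    return f, t

f, iterations = numeric_approximation(q=1.0 / math.sqrt(2.0))
print(iterations, f[1000])                   # f[1000] estimates f(0.5)
```

At $q = 1/\sqrt{2}$ the result can be compared with the closed form of Section III-C; for other $q$ the returned list is the discretized estimate of $f(u)$ itself.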
[Figure 2: four panels (a)-(d) plotting the numeric approximation of $f(u)$ versus $u$ for $p = 0.5$; panel (a) is for $\gamma = 0.5$]

Fig. 2. Simulation curves of numeric approximation to $f(u)$, where $N = 10^5$. All these results coincide with those given
in [29], meaning that numeric approximation is well justified. (a) Evolution of $f^{(t)}(n\Delta)$ with respect to $t$ for
$q = 1/\sqrt{2}$. As $t$ increases, $f^{(t)}(n\Delta)$ converges to $f(u)$. $\mathrm{MSE}^{(38)} < 10^{-10}$. (b) Some results for
$q \in (0.5, (\sqrt{5}-1)/2]$. $\mathrm{MSE}^{(586)} < 10^{-4}$ for $q = 0.51$. $\mathrm{MSE}^{(70)} < 10^{-4}$ for $q = 0.55$. $\mathrm{MSE}^{(51)} < 10^{-4}$ for
$q = (\sqrt{5}-1)/2$. (c) Some results for $q \in ((\sqrt{5}-1)/2, 1/\sqrt{2})$. $\mathrm{MSE}^{(55)} < 10^{-9}$ for $q = (\sqrt{5}-1)/2 + 0.01$.
$\mathrm{MSE}^{(63)} < 10^{-9}$ for $q = 2/3$. $\mathrm{MSE}^{(52)} < 10^{-9}$ for $q = 1/\sqrt{2} - 0.01$. (d) Some results for $q \in [1/\sqrt{2}, 1)$.
$\mathrm{MSE}^{(39)} < 10^{-10}$ for $q = 0.8$. $\mathrm{MSE}^{(54)} < 10^{-10}$ for $q = 0.9$. $\mathrm{MSE}^{(540)} < 10^{-9}$ for $q = 0.99$.
F. Simulation Results

Fig. 2 includes some results regarding numeric approximation. All results reported in Fig. 2
are obtained with $N = 10^5$.

To show how $f^{(t)}(n\Delta)$ converges to $f(u)$, the evolution of $f^{(t)}(n\Delta)$ with $t$ is plotted in
Fig. 2(a). We find that after 38 iterations, the successive MSE is already below $10^{-10}$.

It was affirmed in [29] that $(\sqrt{5}-1)/2$ and $1/\sqrt{2}$ are two watersheds that divide interval
$(0.5, 1)$ of $q$ into three sub-intervals: $(0.5, (\sqrt{5}-1)/2]$, $((\sqrt{5}-1)/2, 1/\sqrt{2})$, and $[1/\sqrt{2}, 1)$,
because $f(u)$ shows very different properties in these three sub-intervals. As in [29], for each
sub-interval of $q$, some simulation results are reported in Figs. 2(b)-(d). All these results coincide
with those given in [29]. Fig. 2(b) confirms the zeros of $f(u)$ at high rates. Fig. 2(d)
shows that $f(u)$ becomes smooth at low rates.

In different sub-intervals, numeric approximation shows very different simulation precision and
computational complexity. First, we consider simulation precision. For $q \in (0.5, (\sqrt{5}-1)/2]$,
tens of iterations are needed to make the successive MSE less than $10^{-4}$, while for
$q \in ((\sqrt{5}-1)/2, 1/\sqrt{2})$, tens of iterations make the successive MSE less than $10^{-9}$. For
$q \in [1/\sqrt{2}, 1)$, tens of iterations can even make the successive MSE less than $10^{-10}$. Second, we
consider computational complexity. We find that $q = 1/\sqrt{2}$ needs the fewest iterations. As $q$ departs
from $1/\sqrt{2}$ in either direction, computational complexity increases, i.e. more iterations are needed
to reach the same successive MSE. Third, as $q$ approaches 0.5 or 1, simulation precision is sharply
degraded, or in other words, computational complexity increases sharply. For example, when
$q = 0.51$, 586 iterations are needed to make the successive MSE less than $10^{-4}$, while for other $q$
in the same sub-interval (e.g. 0.55 and $(\sqrt{5}-1)/2$), tens of iterations are enough to reach the
same precision. A similar phenomenon is also observed for $q = 0.99$.

Improving the simulation precision and accelerating the convergence of the numeric method,
especially for $q$ close to 0.5 or 1, is an interesting open issue.

V. Polynomial Approximation at Low Rates 

It was affirmed in [29] that $f(u)$ is a smooth function when $q > 1/\sqrt{2}$, i.e. $R < 0.5$. This
property suggests that polynomials may be a good approximation to $f(u)$ at low rates ($R < 0.5$).
Below we propose a polynomial approximation to $f(u)$ for $1/\sqrt{2} < q < 1$.
To simplify the analysis, we exploit the symmetry and consider only the left half of $f(u)$:

$$f(u) = \begin{cases} \dfrac{f(u/q)}{2q}, & 0 \le u < (1-q) \\ \dfrac{f(u/q) + f\left(\frac{u-(1-q)}{q}\right)}{2q}, & (1-q) \le u \le 0.5 \end{cases} \tag{25}$$

We rewrite (25) as

$$f(u) = \begin{cases} 2qf(qu), & 0 \le u < v_1 \\ 2qf(qu) - f(u - v_1), & v_1 \le u \le 0.5 \end{cases} \tag{26}$$

where $v_n = (1-q)/q^n$, $n \in \mathbb{N}$. Note that $v_1 < 0.5$ when $q > 1/\sqrt{2}$. Hence, $f(u)$ is a
piecewise-defined function over interval $[0, 0.5]$.

First, in sub-interval $[0, v_1]$, $f(u)$ can be obtained by solving the functional equation
$f(u) = 2qf(qu)$ [29]:

$$f(u) = \phi(u) = \Theta(u)\,u^{\lambda}, \quad 0 \le u < v_1, \tag{27}$$

where $\lambda = (1-\gamma)/\gamma$ and $\Theta(u) = \Theta(uq^k)$, $\forall k \in \mathbb{Z}$.
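To see where the exponent comes from, one can substitute a pure power law $u^{\lambda}$ into $f(u) = 2qf(qu)$. The short derivation below assumes a constant prefactor, which is a simplification of the general solution, and uses $q = 0.5^{\gamma}$:

```latex
\begin{align*}
2q\,(qu)^{\lambda} = u^{\lambda}
  &\iff 2\,q^{\lambda+1} = 1 \\
  &\iff 2^{-\gamma(\lambda+1)} = 2^{-1} \\
  &\iff \lambda = \frac{1}{\gamma} - 1 = \frac{1-\gamma}{\gamma}.
\end{align*}
```

For example, $\gamma = 0.5$ gives $\lambda = 1$, consistent with the linear rise of the closed form at $q = 1/\sqrt{2}$ near $u = 0$. The same substitution shows that any prefactor satisfying $\Theta(u) = \Theta(qu)$ leaves the functional equation satisfied.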

Then, we need to determine $f(u)$ in sub-interval $[v_1, 0.5]$. Because $qu < u$ and $u - v_1 < u$, it
is possible to recursively map sub-interval $[v_1, 0.5]$ onto sub-interval $[0, v_1)$, over which $f(u)$
has been given by (27), by scaling down or shifting $u$, i.e.

$$\begin{cases} f(qu) = \phi(qu), & v_1 \le u < v_2 \\ f(u - v_1) = \phi(u - v_1), & v_1 \le u < 2v_1 \end{cases} \tag{28}$$

This is the key to solving this problem.

It is easy to prove that $u - v_1 < qu$ for $u \in [v_1, 0.5]$. Hence,

$$f(u) = 2qf(qu) - \phi(u - v_1), \quad v_1 \le u < 2v_1. \tag{29}$$

On solving $2v_1 = 0.5$, we obtain $q = 0.8$. Hereinafter, to facilitate our description, we divide
interval $1/\sqrt{2} < q < 1$ into two sub-intervals: $1/\sqrt{2} < q \le 0.8$ (corresponding to $0.5 \le 2v_1$) and
$0.8 < q < 1$ (corresponding to $2v_1 < 0.5$).
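The case boundaries used in this section are roots of equations of the form $v_n(q) = 0.5$ (and $2v_1(q) = 0.5$). A small bisection routine makes them easy to reproduce. The helper names are hypothetical, and the bracketing interval $(0.5, 0.999)$ is an assumption that relies on $v_n$ being decreasing in $q$ there.

```python
import math

def v(n, q):
    """Breakpoint v_n = (1 - q) / q**n."""
    return (1.0 - q) / q ** n

def solve_threshold(n, target=0.5, lo=0.5, hi=0.999):
    """Bisection for the q at which v_n(q) = target (v_n decreases in q)."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if v(n, mid) > target:
            lo = mid          # v_n still too large: move q up
        else:
            hi = mid
    return (lo + hi) / 2.0

for n in (1, 2, 3, 4):
    print(n, round(solve_threshold(n), 4))
print("2*v_1 = 0.5 at q =", round(solve_threshold(1, target=0.25), 4))
```

The root for $n = 2$ is $\sqrt{3} - 1$ in closed form; the roots for $n = 3$ and $n = 4$ have no equally simple expression, and the one for $n = 4$ lies slightly below 0.8.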

A. $1/\sqrt{2} < q \le 0.8$

In this sub-interval, since $0.5 \le 2v_1$, we have

$$f(u) = \begin{cases} \phi(u), & 0 \le u < v_1 \\ 2qf(qu) - \phi(u - v_1), & v_1 \le u \le 0.5 \end{cases} \tag{30}$$

Hence, we need to consider only the term $2qf(qu)$. Depending on the relations between $v_n$ and
0.5, this sub-interval can be further divided into three smaller sub-intervals.
1) $0.5 \le v_2$: On solving $v_2 = 0.5$, we obtain $q = \sqrt{3} - 1$, so this sub-interval corresponds to
$1/\sqrt{2} < q \le \sqrt{3} - 1$. Since $0.5 \le v_2$, we have $qu \le v_1$ for $u \in [v_1, 0.5]$, i.e. $f(qu) = \phi(qu)$.
Remember that $\phi(u) = 2q\phi(qu)$. Thus

$$f(u) = \begin{cases} \phi(u), & 0 \le u < v_1 \\ \phi(u) - \phi(u - v_1), & v_1 \le u \le 0.5 \end{cases} \tag{31}$$

As affirmed in [29], $f(u)$ is a smooth function for $q > 1/\sqrt{2}$. Hence we approximate $\Theta(u)$ by
a constant $c$ and then obtain

$$f(u) \approx \begin{cases} cu^{\lambda}, & 0 \le u < v_1 \\ cu^{\lambda} - c(u - v_1)^{\lambda}, & v_1 \le u \le 0.5 \end{cases} \tag{32}$$

Now we need to determine $c$. Let us integrate $f(u)$ over interval $[0, 0.5]$:

$$\int_0^{0.5} f(u)\,du = c\left(\int_0^{0.5} u^{\lambda}\,du - \int_{v_1}^{0.5} (u - v_1)^{\lambda}\,du\right) = \frac{c\left(0.5^{\lambda+1} - (0.5 - v_1)^{\lambda+1}\right)}{\lambda+1} = 0.5. \tag{33}$$

Thus,

$$c = \frac{0.5(\lambda+1)}{0.5^{\lambda+1} - (0.5 - v_1)^{\lambda+1}}. \tag{34}$$

Due to $\lambda + 1 = 1/\gamma$,

$$c = \frac{1}{2\gamma\left(0.5^{1/\gamma} - (0.5 - v_1)^{1/\gamma}\right)}. \tag{35}$$
2) $v_2 < 0.5 \le v_3$: On solving $v_3 = 0.5$, we obtain $q \approx 0.77$, so this sub-interval corresponds
to $\sqrt{3} - 1 < q \le 0.77$. At first, it can be obtained directly that

$$f(u) = \begin{cases} \phi(u), & 0 \le u < v_1 \\ \phi(u) - \phi(u - v_1), & v_1 \le u < v_2 \end{cases} \tag{36}$$

Then, for $u \in [v_2, 0.5]$, we have $qu \in [v_1, v_2]$, i.e. $f(qu) = \phi(qu) - \phi(qu - v_1)$. Thus

$$f(u) = 2qf(qu) - \phi(u - v_1) = 2q\left(\phi(qu) - \phi(qu - v_1)\right) - \phi(u - v_1), \quad v_2 \le u \le 0.5. \tag{37}$$

Because $2q\phi(qu - v_1) = 2q\phi(q(u - v_2)) = \phi(u - v_2)$, we obtain

$$f(u) = \phi(u) - \sum_{i=1}^{2} \phi(u - v_i), \quad v_2 \le u \le 0.5. \tag{38}$$

Therefore, we can obtain the following approximation

$$f(u) \approx \begin{cases} cu^{\lambda}, & 0 \le u < v_1 \\ cu^{\lambda} - c(u - v_1)^{\lambda}, & v_1 \le u < v_2 \\ cu^{\lambda} - c\sum_{i=1}^{2}(u - v_i)^{\lambda}, & v_2 \le u \le 0.5 \end{cases} \tag{39}$$

where

$$c = \frac{1}{2\gamma\left(0.5^{1/\gamma} - \sum_{i=1}^{2}(0.5 - v_i)^{1/\gamma}\right)}. \tag{40}$$
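3) $v_3 < 0.5 \le v_4$: On solving $v_4 = 0.5$, we obtain $q \approx 0.8$, so this sub-interval corresponds
to $0.77 < q \le 0.8$. By iterations, we can obtain

$$f(u) \approx \begin{cases} cu^{\lambda}, & 0 \le u < v_1 \\ cu^{\lambda} - c(u - v_1)^{\lambda}, & v_1 \le u < v_2 \\ cu^{\lambda} - c\sum_{i=1}^{2}(u - v_i)^{\lambda}, & v_2 \le u < v_3 \\ cu^{\lambda} - c\sum_{i=1}^{3}(u - v_i)^{\lambda}, & v_3 \le u \le 0.5 \end{cases} \tag{41}$$

where

$$c = \frac{1}{2\gamma\left(0.5^{1/\gamma} - \sum_{i=1}^{3}(0.5 - v_i)^{1/\gamma}\right)}. \tag{42}$$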
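The three cases above share one pattern: each breakpoint $v_i$ below $u$ subtracts a shifted power-law term, and the constant $c$ follows from normalizing the left half to 0.5. The sketch below implements that pattern in a single function. Two points are assumptions of the sketch rather than statements of the text: the loop simply continues the pattern to however many breakpoints fall in $[0, 0.5]$, and the function name and the range guard are illustrative.

```python
import math

def polynomial_approximation(u, q):
    """Piecewise power-law approximation of f(u) for 1/sqrt(2) <= q <= 0.8.

    f(u) ~ c * (u**lam - sum_i (u - v_i)**lam), where the sum runs over
    the breakpoints v_i = (1 - q) / q**i with v_i <= u and v_i <= 0.5.
    """
    if not (1.0 / math.sqrt(2.0) - 1e-12 <= q <= 0.8):
        raise ValueError("sketch is only meant for 1/sqrt(2) <= q <= 0.8")
    gamma = -math.log2(q)                 # rate, since q = 0.5**gamma
    lam = (1.0 - gamma) / gamma
    vs = []                               # breakpoints inside [0, 0.5]
    n = 1
    while (1.0 - q) / q ** n <= 0.5:
        vs.append((1.0 - q) / q ** n)
        n += 1
    # constant c obtained by integrating over [0, 0.5] and equating to 0.5
    denom = 0.5 ** (1.0 / gamma) - sum((0.5 - vi) ** (1.0 / gamma) for vi in vs)
    c = 1.0 / (2.0 * gamma * denom)
    u = min(u, 1.0 - u)                   # symmetry f(u) = f(1 - u)
    return c * (u ** lam - sum((u - vi) ** lam for vi in vs if vi <= u))

for q in (0.725, 0.75, 0.775, 0.8):
    print(q, [round(polynomial_approximation(u, q), 3) for u in (0.1, 0.3, 0.5)])
```

At $q = 1/\sqrt{2}$ the exponent is $\lambda = 1$ and the sketch reduces to the piecewise-linear closed form of Section III-C, which gives a convenient consistency check.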



B. $0.8 < q < 1$

The problem becomes very complex in this sub-interval because $f(u - v_1) = \phi(u - v_1)$ does
not hold for $u \in [2v_1, 0.5]$, so that we need to deal with not only $2qf(qu)$ but also $f(u - v_1)$.
Let us consider a simple case first, i.e. $v_1 \le 0.5 - v_1 < v_2$, which corresponds to sub-interval
$0.8 < q < \sqrt{2/3}$. We have $u - v_1 \in [v_1, v_2]$ for $u \in [2v_1, 0.5]$. Hence

$$f(u - v_1) = \phi(u - v_1) - \phi(u - 2v_1), \quad 2v_1 \le u \le 0.5. \tag{43}$$

Therefore, the problem becomes

$$f(u) = \begin{cases} 2qf(qu), & 0 \le u < v_1 \\ 2qf(qu) - \phi(u - v_1), & v_1 \le u < 2v_1 \\ 2qf(qu) - \left(\phi(u - v_1) - \phi(u - 2v_1)\right), & 2v_1 \le u \le 0.5 \end{cases} \tag{44}$$

Now we need to deal with only $2qf(qu)$, which has been discussed in detail in Section V-A.






[Figure 3: four panels plotting $f(u)$ versus $u$ for (a) $q = 0.725$, (b) $q = 0.75$, (c) $q = 0.775$, (d) $q = 0.8$]

Fig. 3. Comparisons of polynomial approximation with numeric approximation, where $N = 10^5$ and $\delta = 10^{-10}$ for numeric
approximation. These results show that polynomial approximation fits numeric approximation very well. In particular, as $q$
increases, polynomial approximation almost coincides with numeric approximation. (a) $q = 0.725$. (b) $q = 0.75$.
(c) $q = 0.775$. (d) $q = 0.8$.



For $\sqrt{2/3} \le q < 1$, the idea is the same, but the procedure becomes more and more
complicated as $q$ increases. Therefore, at very low rates, polynomial approximation is not a
good choice.

C. Simulation Results 

Some examples of polynomial approximation are included in Fig. 3. Considering the
complexity, only the results for $1/\sqrt{2} < q \le 0.8$ are reported. Fig. 3 shows that in general, the
curves of polynomial approximation fit those of numeric approximation very well. In particular,
as $q$ increases, the curves of polynomial approximation almost coincide with those of numeric
approximation. In addition, Fig. 3(a) also shows that the affirmation in [29] may fail, because
$q > 1/\sqrt{2}$ does not guarantee a smooth $f(u)$. Nevertheless, $f(u)$ does become less irregular as $q$
increases.

VI. Gaussian Approximation at Very Low Rates 

As pointed out in Section V, polynomial approximation to $f(u)$ becomes very complex as $q$
increases. Thus a simpler approximation method is needed at very low rates. Through
experiments, we observe that $f(u)$ becomes bell-shaped at very low rates [29]. This phenomenon
suggests that a Gaussian function centered at 0.5 may be a good approximation to $f(u)$, i.e.

$$f(u) \approx \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(u - 0.5)^2}{2\sigma^2}\right). \tag{45}$$

Obviously, the problem now boils down to how to estimate $\sigma^2$ for a given $q$.

A. Estimation of $\sigma^2$

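Here we propose a simple method to estimate $\sigma^2$ by exploiting $qf(0.5) = f(0.5/q)$ (see Fig. 1).
For a large $q$, we have

$$qf(0.5) \approx \frac{q}{\sqrt{2\pi}\,\sigma} \tag{46}$$

and

$$f(0.5/q) \approx \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(1-q)^2}{8q^2\sigma^2}\right). \tag{47}$$

Hence,

$$q \approx \exp\left(-\frac{(1-q)^2}{8q^2\sigma^2}\right). \tag{48}$$

Therefore

$$\sigma^2 \approx -\frac{(1-q)^2}{8q^2 \ln q}. \tag{49}$$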
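The variance estimate is a one-line formula, so the whole Gaussian approximation takes only a few lines of code. The function names below are hypothetical; the sketch evaluates $\sigma^2 = -(1-q)^2/(8q^2 \ln q)$ and the Gaussian density centered at 0.5, and prints both for the values of $q$ used in the simulations.

```python
import math

def gaussian_variance(q):
    """Variance estimate sigma^2 = -(1 - q)^2 / (8 q^2 ln q), for 0 < q < 1."""
    return -((1.0 - q) ** 2) / (8.0 * q * q * math.log(q))

def gaussian_approximation(u, q):
    """Gaussian density centred at 0.5 with the variance estimated from q."""
    var = gaussian_variance(q)
    return math.exp(-((u - 0.5) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

for q in (0.85, 0.9, 0.95, 0.99):
    print(q, gaussian_variance(q), gaussian_approximation(0.5, q))
```

By construction, this choice of $\sigma^2$ makes the Gaussian satisfy $qf(0.5) = f(0.5/q)$ exactly; how well it matches $f(u)$ elsewhere is what the simulations examine.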

B. Simulation Results 

Some examples of Gaussian approximation are included in Fig. 4. These plots show that as $q$
increases, the curves of Gaussian approximation become closer and closer to those of numeric
approximation. In particular, when $q = 0.99$, the curve of Gaussian approximation almost coincides
with that of numeric approximation. All these results confirm that Gaussian approximation does
work well at very low rates.
[Figure 4: four panels plotting $f(u)$ versus $u$ for (a) $q = 0.85$, (b) $q = 0.9$, (c) $q = 0.95$, (d) $q = 0.99$]

Fig. 4. Comparisons of Gaussian approximation with numeric approximation, where $N = 10^5$ for numeric approximation. As
$q$ increases, Gaussian approximation becomes more and more accurate. (a) $q = 0.85$, $\delta = 10^{-10}$ for numeric approximation.
(b) $q = 0.9$, $\delta = 10^{-10}$ for numeric approximation. (c) $q = 0.95$, $\delta = 10^{-10}$ for numeric approximation.
(d) $q = 0.99$, $\delta = 10^{-9}$ for numeric approximation.



VII. Conclusion 

This paper proposes three approximation methods for the DAC codeword distribution of
equiprobable binary sources along proper decoding paths. These methods are well justified by simulation
results. The related software is available at [30].

Nevertheless, many open issues remain. First, how should the problem be formulated for the codeword
distribution along wrong decoding paths? Second, how should the problem be formulated for general
(non-equiprobable or M-ary) sources? Third, can we find the number of possible decoding paths
as well as the distributions of $D(X, \tilde{X})$ and $D(Y, \tilde{X})$ for a given DAC code of $X$? Finally, it is
an interesting issue to define the codeword distribution for the ECAC.